我们建议在线变更点检测的新过程。我们的方法扩展了一个想法,即在变更前和变换后分布之间最大化差异度量。这将导致一个适合参数和非参数场景的灵活过程。我们证明了程序的平均运行长度及其预期的检测延迟。通过关于合成和现实世界数据集的数值实验来说明该算法的效率。
translated by 谷歌翻译
This paper investigates the approximation properties of deep neural networks with piecewise-polynomial activation functions. We derive the required depth, width, and sparsity of a deep neural network to approximate any H\"{o}lder smooth function up to a given approximation error in H\"{o}lder norms in such a way that all weights of this neural network are bounded by $1$. The latter feature is essential to control generalization errors in many statistical and machine learning applications.
translated by 谷歌翻译
在这项工作中,我们对香草生成的对抗网络(GAN)的非渐近性质进行了彻底的研究。We derive theoretical guarantees for the density estimation with GANs under a proper choice of the deep neural networks classes representing generators and discriminators.特别是,我们证明了由此产生的估计会聚到真实密度$ \ mathsf {p} ^ * $以jensen-shannon(js)以$(\ log {n} / n)^ {2 \Beta /(2 \ beta + d)} $ why $ n $是样本大小和$ \ beta $ commentines $ \ mathsf {p} ^ * $的平滑度。据我们所知,这是使用Vanilla Gans的浓度估计的文献中的第一个结果,这些融合率比N ^ { - 1/2} $更快地在政权$ \ beta> D / 2 $中。此外,我们表明所获得的速率是考虑的密度类别的最低限度最佳(最高因子因子)。
translated by 谷歌翻译
Determining and predicting reservoir formation properties for newly drilled wells represents a significant challenge. One of the variations of these properties evaluation is well-interval similarity. Many methodologies for similarity learning exist: from rule-based approaches to deep neural networks. Recently, articles adopted, e.g. recurrent neural networks to build a similarity model as we deal with sequential data. Such an approach suffers from short-term memory, as it pays more attention to the end of a sequence. Neural network with Transformer architecture instead cast their attention over all sequences to make a decision. To make them more efficient in terms of computational time, we introduce a limited attention mechanism similar to Informer and Performer architectures. We conduct experiments on open datasets with more than 20 wells making our experiments reliable and suitable for industrial usage. The best results were obtained with our adaptation of the Informer variant of Transformer with ROC AUC 0.982. It outperforms classical approaches with ROC AUC 0.824, Recurrent neural networks with ROC AUC 0.934 and straightforward usage of Transformers with ROC AUC 0.961.
translated by 谷歌翻译
Markowitz mean-variance portfolios with sample mean and covariance as input parameters feature numerous issues in practice. They perform poorly out of sample due to estimation error, they experience extreme weights together with high sensitivity to change in input parameters. The heavy-tail characteristics of financial time series are in fact the cause for these erratic fluctuations of weights that consequently create substantial transaction costs. In robustifying the weights we present a toolbox for stabilizing costs and weights for global minimum Markowitz portfolios. Utilizing a projected gradient descent (PGD) technique, we avoid the estimation and inversion of the covariance operator as a whole and concentrate on robust estimation of the gradient descent increment. Using modern tools of robust statistics we construct a computationally efficient estimator with almost Gaussian properties based on median-of-means uniformly over weights. This robustified Markowitz approach is confirmed by empirical studies on equity markets. We demonstrate that robustified portfolios reach the lowest turnover compared to shrinkage-based and constrained portfolios while preserving or slightly improving out-of-sample performance.
translated by 谷歌翻译
Network Intrusion and Detection Systems (NIDS) are essential for malicious traffic and cyberattack detection in modern networks. Artificial intelligence-based NIDS are powerful tools that can learn complex data correlations for accurate attack prediction. Graph Neural Networks (GNNs) provide an opportunity to analyze network topology along with flow features which makes them particularly suitable for NIDS applications. However, successful application of such tool requires large amounts of carefully collected and labeled data for training and testing. In this paper we inspect different versions of ToN-IoT dataset and point out inconsistencies in some versions. We filter the full version of ToN-IoT and present a new version labeled ToN-IoT-R. To ensure generalization we propose a new standardized and compact set of flow features which are derived solely from NetFlowv5-compatible data. We separate numeric data and flags into different categories and propose a new dataset-agnostic normalization approach for numeric features. This allows us to preserve meaning of flow flags and we propose to conduct targeted analysis based on, for instance, network protocols. For flow classification we use E-GraphSage algorithm with modified node initialization technique that allows us to add node degree to node features. We achieve high classification accuracy on ToN-IoT-R and compare it with previously published results for ToN-IoT, NF-ToN-IoT, and NF-ToN-IoT-v2. We highlight the importance of careful data collection and labeling and appropriate data preprocessing choice and conclude that the proposed set of features is more applicable for real NIDS due to being less demanding to traffic monitoring equipment while preserving high flow classification accuracy.
translated by 谷歌翻译
Migraine is a high-prevalence and disabling neurological disorder. However, information migraine management in real-world settings could be limited to traditional health information sources. In this paper, we (i) verify that there is substantial migraine-related chatter available on social media (Twitter and Reddit), self-reported by migraine sufferers; (ii) develop a platform-independent text classification system for automatically detecting self-reported migraine-related posts, and (iii) conduct analyses of the self-reported posts to assess the utility of social media for studying this problem. We manually annotated 5750 Twitter posts and 302 Reddit posts. Our system achieved an F1 score of 0.90 on Twitter and 0.93 on Reddit. Analysis of information posted by our 'migraine cohort' revealed the presence of a plethora of relevant information about migraine therapies and patient sentiments associated with them. Our study forms the foundation for conducting an in-depth analysis of migraine-related information using social media data.
translated by 谷歌翻译
Triplet extraction aims to extract entities and their corresponding relations in unstructured text. Most existing methods train an extraction model on high-quality training data, and hence are incapable of extracting relations that were not observed during training. Generalizing the model to unseen relations typically requires fine-tuning on synthetic training data which is often noisy and unreliable. In this paper, we argue that reducing triplet extraction to a template filling task over a pre-trained language model can equip the model with zero-shot learning capabilities and enable it to leverage the implicit knowledge in the language model. Embodying these ideas, we propose a novel framework, ZETT (ZEro-shot Triplet extraction by Template infilling), that is based on end-to-end generative transformers. Our experiments show that without any data augmentation or pipeline systems, ZETT can outperform previous state-of-the-art models with 25% less parameters. We further show that ZETT is more robust in detecting entities and can be incorporated with automatically generated templates for relations.
translated by 谷歌翻译
Task-oriented dialogue (TOD) systems have been applied in a range of domains to support human users to achieve specific goals. Systems are typically constructed for a single domain or language and do not generalise well beyond this. Their extension to other languages in particular is restricted by the lack of available training data for many of the world's languages. To support work on Natural Language Understanding (NLU) in TOD across multiple languages and domains simultaneously, we constructed MULTI3NLU++, a multilingual, multi-intent, multi-domain dataset. MULTI3NLU++ extends the English-only NLU++ dataset to include manual translations into a range of high, medium and low resource languages (Spanish, Marathi, Turkish and Amharic), in two domains (banking and hotels). MULTI3NLU++ inherits the multi-intent property of NLU++, where an utterance may be labelled with multiple intents, providing a more realistic representation of a user's goals and aligning with the more complex tasks that commercial systems aim to model. We use MULTI3NLU++ to benchmark state-of-the-art multilingual language models as well as Machine Translation and Question Answering systems for the NLU task of intent detection for TOD systems in the multilingual setting. The results demonstrate the challenging nature of the dataset, particularly in the low-resource language setting.
translated by 谷歌翻译
Automatic machine translation (MT) metrics are widely used to distinguish the translation qualities of machine translation systems across relatively large test sets (system-level evaluation). However, it is unclear if automatic metrics are reliable at distinguishing good translations from bad translations at the sentence level (segment-level evaluation). In this paper, we investigate how useful MT metrics are at detecting the success of a machine translation component when placed in a larger platform with a downstream task. We evaluate the segment-level performance of the most widely used MT metrics (chrF, COMET, BERTScore, etc.) on three downstream cross-lingual tasks (dialogue state tracking, question answering, and semantic parsing). For each task, we only have access to a monolingual task-specific model. We calculate the correlation between the metric's ability to predict a good/bad translation with the success/failure on the final task for the Translate-Test setup. Our experiments demonstrate that all metrics exhibit negligible correlation with the extrinsic evaluation of the downstream outcomes. We also find that the scores provided by neural metrics are not interpretable mostly because of undefined ranges. Our analysis suggests that future MT metrics be designed to produce error labels rather than scores to facilitate extrinsic evaluation.
translated by 谷歌翻译